Satyrn: A Platform for Analytics Augmented Generation
Sterbentz, Marko, Barrie, Cameron, Shahi, Shubham, Dutta, Abhratanu, Hooshmand, Donna, Pack, Harper, Hammond, Kristian J.
Large language models (LLMs) are capable of producing documents, and retrieval augmented generation (RAG) has shown itself to be a powerful method for improving accuracy without sacrificing fluency. However, not all information can be retrieved from text. We propose an approach that uses the analysis of structured data to generate fact sets that guide generation in much the same way that retrieved documents are used in RAG. This analytics augmented generation (AAG) approach supports the use of standard analytic techniques to generate facts that are then converted to text and passed to an LLM. We present Satyrn, a neurosymbolic platform that leverages AAG to produce accurate, fluent, and coherent reports grounded in large-scale databases. In our experiments, we find that Satyrn generates reports in which over 86% of claims are accurate while maintaining high levels of fluency and coherence, even when using smaller language models such as Mistral-7B; by comparison, just 57% of the claims in reports generated by GPT-4 Code Interpreter are accurate.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Illinois > Lake County (0.15)
- North America > United States > Oregon (0.05)
- (16 more...)
- Education (1.00)
- Government > Regional Government > North America Government > United States Government (0.93)
- Law (0.93)
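The AAG pipeline the abstract describes can be sketched end to end: run standard analytics over structured data, render the results as natural-language facts, and place those facts in the prompt the way RAG places retrieved documents. This is a minimal sketch under invented assumptions: the case-duration schema and fact templates below are illustrative, not Satyrn's actual data model, and the final LLM call is omitted.

```python
import sqlite3

def compute_facts(conn):
    """Run standard analytics over structured data and emit text facts."""
    cur = conn.execute(
        "SELECT court, COUNT(*), AVG(duration_days) FROM cases GROUP BY court"
    )
    return [
        f"The {court} court handled {n} cases with an average "
        f"duration of {avg:.1f} days."
        for court, n, avg in cur
    ]

def build_prompt(facts, question):
    """Ground generation in the fact set, as RAG does with documents."""
    fact_text = "\n".join(f"- {f}" for f in facts)
    return (
        "Write a short report answering the question below, "
        f"using ONLY these facts:\n{fact_text}\n\nQuestion: {question}"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (court TEXT, duration_days REAL)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [("N.D. Ill.", 120.0), ("N.D. Ill.", 80.0), ("S.D.N.Y.", 200.0)],
)
prompt = build_prompt(
    compute_facts(conn), "How do case durations compare across courts?"
)
print(prompt)
```

Because the facts are computed symbolically before generation, the LLM's job reduces to fluent verbalization, which is what allows smaller models to stay accurate.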
Exploring Value Biases: How LLMs Deviate Towards the Ideal
Sivaprasad, Sarath, Kaushik, Pramod, Abdelnabi, Sahar, Fritz, Mario
Large-Language-Models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact. Understanding the non-deliberate(ive) mechanism of LLMs in giving responses is essential in explaining their performance and discerning their biases in real-world applications. This is analogous to human studies, where such inadvertent responses are referred to as sampling. We study this sampling of LLMs in light of value bias and show that the sampling of LLMs tends to favour high-value options. Value bias corresponds to this shift of response from the most likely towards an ideal value represented in the LLM. In fact, this effect can be reproduced even with new entities learnt via in-context prompting. We show that this bias manifests in unexpected places and has implications for relevant application scenarios, like choosing exemplars. The results show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
- North America > United States > New York (0.14)
- North America > United States > Montana (0.04)
- North America > United States > Hawaii (0.04)
- (2 more...)
- Transportation > Ground > Road (1.00)
- Leisure & Entertainment (1.00)
- Education (1.00)
- (3 more...)
No link found between Japan COVID-19 school closures and achievement test results
Achievement tests for elementary and junior high school children across Japan showed no correlation between the percentages of correct answers and the lengths of coronavirus school closures, education ministry data showed Tuesday. Gaps in the average percentages of correct answers between the prefectures were also small. The tests for elementary school sixth-graders and junior high school third-graders were carried out in May, after last year's tests were cancelled due to blanket school closures triggered by the COVID-19 crisis. The tests measured achievement in Japanese language and arithmetic for elementary school students and in Japanese and mathematics for junior high school students. Some 1.97 million students at about 29,000 public and private schools participated, covering almost all public schools and about half of private schools in Japan.
Coach2vec: autoencoding the playing style of soccer coaches
Cintia, Paolo, Pappalardo, Luca
Capturing the playing style of professional soccer coaches is a complex, and yet barely explored, task in sports analytics. Nowadays, the availability of digital data describing every relevant spatio-temporal aspect of soccer matches allows for capturing and analyzing the playing style of players, teams, and coaches in an automatic way. In this paper, we present coach2vec, a workflow to capture the playing style of professional coaches using match event streams and artificial intelligence. Coach2vec extracts ball possessions from each match, clusters them based on their similarity, and reconstructs the typical ball possessions of coaches. Then, it uses an autoencoder, a type of artificial neural network, to obtain a concise representation (encoding) of the playing style of each coach. Our experiments, conducted on soccer-logs describing the last four seasons of the Italian first division, reveal interesting similarities between prominent coaches, paving the way to the simulation of playing styles and the quantitative comparison of professional coaches.
- Workflow (0.53)
- Research Report (0.40)
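The final step of the workflow, compressing each coach's possession-cluster profile into a compact encoding, can be sketched with a toy linear autoencoder. The cluster frequencies below are invented for illustration (the real workflow derives them from match event streams), and the paper's actual autoencoder architecture is not specified here, so this is a minimal linear stand-in trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# 4 coaches x 6 possession-cluster frequencies (each row sums to 1)
X = np.array([
    [0.30, 0.25, 0.15, 0.10, 0.10, 0.10],  # possession-heavy coach
    [0.10, 0.10, 0.30, 0.30, 0.10, 0.10],  # counter-attacking coach
    [0.28, 0.22, 0.18, 0.12, 0.10, 0.10],  # similar to the first
    [0.12, 0.08, 0.28, 0.32, 0.10, 0.10],  # similar to the second
])

d, k = X.shape[1], 2                       # 6-d profile -> 2-d encoding
W_enc = rng.normal(scale=0.1, size=(d, k))
W_dec = rng.normal(scale=0.1, size=(k, d))

def mse(X, W_enc, W_dec):
    X_hat = X @ W_enc @ W_dec              # encode then decode
    return float(np.mean((X_hat - X) ** 2))

initial_loss = mse(X, W_enc, W_dec)
lr = 0.5
for _ in range(2000):
    H = X @ W_enc                          # encodings of all coaches
    G = 2 * (H @ W_dec - X) / X.size       # d(loss)/d(X_hat)
    grad_dec = H.T @ G                     # gradient w.r.t. decoder
    grad_enc = X.T @ (G @ W_dec.T)         # gradient w.r.t. encoder
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final_loss = mse(X, W_enc, W_dec)
encodings = X @ W_enc                      # coach2vec-style vectors
```

Distances between the 2-d encodings can then serve as the similarity measure between coaches that the abstract describes.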
Automated Detection of Rest Disruptions in Critically Ill Patients
Iyengar, Vasundhra, Bihorac, Azra, Rashidi, Parisa
Sleep has been shown to be an indispensable and important component of a patient's recovery process. Nonetheless, the sleep quality of patients in the Intensive Care Unit (ICU) is often low, due to factors such as noise, pain, and frequent nursing care activities. Frequent sleep disruptions by the medical staff and/or visitors at certain times might disrupt the patient's sleep-wake cycle and can also impact the severity of pain. Examining the association between sleep quality and frequent visitation has been difficult due to the lack of automated methods for visitation detection. In this study, we recruited 38 patients and automatically assessed visitation frequency from captured video frames. We used the DensePose R-CNN (ResNet-101) model to calculate the number of people in the room in each video frame. We examined when patients are interrupted the most, and we examined the association between frequent disruptions and patient outcomes on pain and length of stay.
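Once a per-frame people count is available, turning it into disruption episodes is a simple thresholding pass. This sketch assumes the DensePose model's output has already been reduced to a count per frame (the counts below are synthetic, and the baseline of one person, i.e. the patient alone, is an assumption, not a detail from the study).

```python
def find_disruptions(people_counts, baseline=1):
    """Return (start, end) frame-index pairs where the room count
    exceeds the baseline (the patient alone), marking a visitation."""
    episodes, start = [], None
    for i, count in enumerate(people_counts):
        if count > baseline and start is None:
            start = i                        # visitation episode begins
        elif count <= baseline and start is not None:
            episodes.append((start, i - 1))  # episode ended on previous frame
            start = None
    if start is not None:                    # episode runs to the last frame
        episodes.append((start, len(people_counts) - 1))
    return episodes

# Synthetic per-frame counts standing in for DensePose output:
counts = [1, 1, 2, 3, 2, 1, 1, 2, 2, 1]
print(find_disruptions(counts))  # [(2, 4), (7, 8)]
```

Episode frequency and timing can then be correlated with outcomes such as pain scores and length of stay, as the study describes.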
The Most In Demand Tech Skills for Data Scientists - KDnuggets
In the fall of 2018 I analyzed the most in-demand skills and technologies for data scientists. That article resonated with folks. It has over 11,000 claps on Medium, was translated into several languages, and was the most popular story on KDnuggets for November 2018. A little over a year has passed. By the end of this article you'll know which technologies are becoming more popular with employers and which are becoming less popular.
- Information Technology > Communications > Social Media (0.77)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
- Information Technology > Data Science > Data Mining > Big Data (0.31)
Exploration in NetHack With Secret Discovery
Campbell, Jonathan C., Verbrugge, Clark
Roguelike games generally feature exploration problems as a critical, yet often repetitive element of gameplay. This paper presents an algorithmic approach to exploration of roguelike dungeon environments. Our design aims to minimize exploration time, balancing coverage and discovery of secret areas with resource cost. Our algorithm is based on the concept of occupancy maps popular in robotics, adapted to encourage efficient discovery of secret access points. Through extensive experimentation on NetHack maps we show that this technique is significantly more efficient than simpler greedy approaches and an existing automated player. We further investigate optimized parameterization for the algorithm through a comprehensive data analysis. These results point towards better automation for players as well as heuristics applicable to fully automated gameplay.
Many video games place emphasis on the idea of exploration of the unknown. In roguelikes, a popular subset of Role-Playing Games (RPGs), exploration of the game space is a key game mechanic, essential to resource acquisition and game progress. The high level of repetition involved, however, makes automation of the exploration process useful, as an assistance in game design, for relieving player tedium in relatively safe levels or under casual play, and to ease control requirements for those operating with reduced interfaces [1]. Basic forms of automated exploration are found in several roguelikes, including the popular Dungeon Crawl Stone Soup. Even with full information, however, ensuring complete coverage can result in significant inefficiency, with coverage improvement coming at greater cost as exploration continues [2]. Diminishing returns are further magnified in the presence of "secret rooms," areas which must be intentionally searched for at additional, nontrivial resource cost, and which are a common feature of roguelike games.
In such contexts, the complexity is less driven by the need to be thorough, and more given by the need to balance time spent exploring with respect to amount of benefit accrued (area revealed, items collected). In this work we present a novel algorithm for exploration of an initially unknown environment. Our design aims to accommodate features common to roguelike games. In particular, we aim for an efficient, balanced approach to exploration, considering the cost of further exploration in relation to the potential benefit.
- North America > Canada > Quebec > Montreal (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York (0.04)
- Europe > Czechia > Prague (0.04)
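The cost-versus-benefit trade-off the paper describes can be illustrated with a simplified frontier-selection sketch: score each explored cell by how much unknown area it borders, minus the travel cost to reach it. This is a greedy stand-in, not the paper's occupancy-map algorithm, and the `benefit` weight is a hypothetical parameter.

```python
from collections import deque

# Grid cells: '#' wall, '.' explored floor, '?' unexplored/unknown.

def bfs_dist(grid, start):
    """Travel cost (in moves) from start to every explored floor cell."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and (nr, nc) not in dist and grid[nr][nc] == '.'):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def unknown_neighbours(grid, r, c):
    """How many adjacent cells are still unknown (potential new area)."""
    return sum(
        1 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= r + dr < len(grid) and 0 <= c + dc < len(grid[0])
        and grid[r + dr][c + dc] == '?'
    )

def best_frontier(grid, pos, benefit=3.0):
    """Pick the frontier cell with the best benefit-minus-cost score."""
    dist = bfs_dist(grid, pos)
    scored = [
        (benefit * unknown_neighbours(grid, r, c) - d, (r, c))
        for (r, c), d in dist.items()
        if unknown_neighbours(grid, r, c) > 0
    ]
    return max(scored)[1] if scored else None  # None: nothing left to gain

grid = ["#####",
        "#...?",
        "#####"]
print(best_frontier(grid, (1, 1)))  # (1, 3): the cell bordering unknown space
```

Returning `None` once no frontier scores positively is one way to encode the diminishing-returns stopping condition; the occupancy-map approach refines this by maintaining per-cell probabilities, which also lets secret-passage likelihood feed into the same score.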
Imbalanced Datasets
Imagine you are a medical professional who is training a classifier to detect whether an individual has an extremely rare disease. You train your classifier, and it yields 99.9% accuracy on your test set. You're overcome with joy by these results, but when you check the labels output by the classifier, you see it always outputs "No Disease," regardless of the patient data. Because the disease is extremely rare, there were only a handful of patients with the disease in your dataset compared to the thousands of patients without it. Because over 99.9% of the patients in your dataset don't have the disease, any classifier can achieve an impressively high accuracy simply by returning "No Disease" for every new patient.
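The arithmetic behind the degenerate classifier is easy to check directly. With one diseased patient among a thousand (an illustrative split), always predicting "No Disease" scores 99.9% accuracy while catching zero actual cases:

```python
# 1 positive among 1000 patients, mirroring the rare-disease setting
labels = ["Disease"] + ["No Disease"] * 999
preds = ["No Disease"] * len(labels)          # the degenerate classifier

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
disease_recall = (
    sum(p == "Disease" and y == "Disease" for p, y in zip(preds, labels))
    / labels.count("Disease")
)
print(f"accuracy = {accuracy:.1%}")                # accuracy = 99.9%
print(f"disease recall = {disease_recall:.0%}")    # disease recall = 0%
```

This is why class-sensitive metrics such as recall, precision, or balanced accuracy, rather than raw accuracy, are the right yardstick on imbalanced datasets.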